Abstract. It is a classical result that any finite tree with positively 
weighted edges, and without vertices of degree 2, is uniquely determined 
by the weighted path distance between each pair of leaves. Moreover, 
it is possible for a (small) strict subset C of leaf pairs to suffice for 
reconstructing the tree and its edge weights, given just the distances 
between the leaf pairs in C. It is known that any set C with this property 
for a tree in which all interior vertices have degree 3 must form a cover 
for T - that is, for each interior vertex v of T, C must contain a pair 
of leaves from each pair of the three components of T - v. Here we 
provide a partial converse of this result by showing that if a set C of leaf 
pairs forms a cover of a certain type for such a tree T then T and its 
edge weights can be uniquely determined from the distances between the 
pairs of leaves in C. Moreover, there is a polynomial-time algorithm for 
achieving this reconstruction. The result establishes a special case of a 
recent question concerning 'triplet covers', and is relevant to a problem 
arising in evolutionary genomics. 
1. Introduction 

Any tree T with positively weighted edges induces a metric d on the set 
of leaves by considering the weighted path distance in T between each pair of 
leaves. Moreover, provided T has no vertices of degree 2, and that we ignore 
the labeling of interior vertices, both T and its edge weights are uniquely 
determined by the metric d. This uniqueness result has been known since 
the 1960s and fast algorithms exist for reconstructing both the tree and 
its edge weights from d (for further background the interested reader may 
consult m and [9] and the references therein). 

The uniqueness result and the algorithms are important in evolutionary 
biology for reconstructing an evolutionary tree of species from genetic data 
[BJ. However, in this setting one frequently does not have d-values available 
for all pairs of species, due to the patchy nature of genomic coverage [8]. 

This raises a fundamental mathematical question - for which subsets of 
pairs of leaves of a tree do we need to know the d-values in order to uniquely 
recover the tree and its edge weights? In general this appears a difficult 
question (indeed determining whether such a partial d-metric is realized 
by any tree is NP-hard [H]). However, some sufficient conditions (as well as 
some necessary conditions) for uniqueness to hold have been found, in [3"lll2|. 
and more recently in [1] and [7]. In this paper we consider the uniqueness 
question for trees that are 'fully-resolved' (i.e. all the interior vertices have 
degree 3) as these trees are of particular importance in evolutionary biology, 
and because the uniqueness question is easier to study for this class of trees. 

The structure of this paper is as follows. First we introduce some back- 
ground terminology and concepts, and then we define the particular type 
of subsets of leaf pairs (called 'stable triplet covers') which we show suffice 
to uniquely determine a fully-resolved tree. Moreover, we show how this 
comes about by establishing two combinatorial properties of stable triplet 
covers - a 'shellability' property and a graph-theoretic property related to 
tree-width, which we show is quite different to shellability. We conclude by 
providing a proof that a polynomial-time algorithm will reconstruct a tree 
and its edge weights for any set of leaf pairs that contains a stable triplet 
cover (or more generally a shellable subset). Our result answers a special 
case of the question posed at the end of [1] of whether every 'triplet cover' 
of a fully-resolved tree determines the tree and its edge weights. 

2. Preliminaries 

We now introduce some precise definitions required to state and prove 
our main results. We mostly follow the notation and terminology of [9] and 

2.1. X-trees, edge-weightings and distances. For the rest of the paper, 
assume that $|X| \geq 3$. An X-tree T = (V, E) is a graph-theoretical tree 
whose leaf set is X and which does not have any vertices of degree 2. We call 
an X-tree fully-resolved if every interior vertex of T, that is, every non-leaf 
vertex of T, has degree three. Moreover, we call two distinct leaves x and y 
of T a cherry of T, denoted by x, y, if the parent of x is simultaneously the 
parent of y. For any subset $Y \subseteq X$, we denote by $T|_Y$ the Y-tree 
obtained by restricting T to Y (suppressing resulting degree two vertices). 

An example of a fully-resolved X-tree for X = {a, b, c, d, e, f, g}, 
having two cherries, is shown in Fig. 1(i). 

In case $|Y| = 4$, say Y = {a, b, c, d}, and the path from a to b does not 
share a vertex with the path from c to d in $T|_Y$, we refer to $T|_Y$ as a 
quartet tree and denote it by ab||cd. Note that by deleting any edge 
$e \in E$ from T the leaf sets $A_e$ and $B_e := X - A_e$ of the resulting 
two trees induce a bipartition of X. We refer to such a bipartition as an 
X-split and denote it by A|B where $A := A_e$ and $B := B_e$. We say that 
two X-trees T = (V, E) and T' = (V', E') are equivalent if there exists a 
bijection $\phi : V \to V'$ that is the identity on X and extends to a graph 
isomorphism from T to T'. 
Suppose for the following that T = (V, E) is an X-tree. Then we call a 
map $w : E \to \mathbb{R}_{\geq 0}$ that assigns a weight, that is, a 
non-negative real number, to every edge of T an edge-weighting for T. Note 
that this definition implies that some of the edges of T might have weight 
zero. We denote an X-tree T together with an edge-weighting w by the pair 
(T, w) and call an edge-weighting that assigns non-zero weight to every edge 
of T that is not incident with a leaf of T proper. Note that for any 
edge-weighting w of T, taking the sum of the weights of the edges on the 
shortest path from some $x \in X$ to some $y \in X$ induces a distance 
between x and y and thus a distance $d = d_{(T,w)}$ on X. 

For example, in the tree in Fig. 1(i), if each edge has weight 1, then 
d(a, b) = 2, d(c, e) = 4, and d(c, f) = 5. 
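
The distance computation can be illustrated with a short Python sketch. The figure is not reproduced here, so the topology below is an assumption: a caterpillar tree with cherries a, b and f, g, chosen because it has exactly two cherries and is consistent with the three distances quoted above. The names `EDGES`, `leaf_distances` and the interior labels `v1`, ..., `v5` are hypothetical.

```python
from itertools import combinations

# Assumed topology (not taken from the figure): a caterpillar with cherries
# {a, b} and {f, g}; c, d, e hang off the path between them.
EDGES = [
    ("a", "v1"), ("b", "v1"), ("v1", "v2"), ("c", "v2"), ("v2", "v3"),
    ("d", "v3"), ("v3", "v4"), ("e", "v4"), ("v4", "v5"), ("f", "v5"),
    ("g", "v5"),
]


def leaf_distances(edges, weight=None):
    """Return {frozenset({x, y}): d(x, y)} for all pairs of leaves."""
    weight = weight or {}
    adj = {}
    for u, v in edges:
        w = weight.get((u, v), 1)  # unit weight unless stated otherwise
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    leaves = sorted(v for v, nbrs in adj.items() if len(nbrs) == 1)

    def from_source(src):
        # Depth-first walk; in a tree the path to each vertex is unique,
        # so the first distance recorded for a vertex is the path distance.
        dist, stack = {src: 0}, [src]
        while stack:
            u = stack.pop()
            for v, w in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + w
                    stack.append(v)
        return dist

    table = {x: from_source(x) for x in leaves}
    return {frozenset((x, y)): table[x][y] for x, y in combinations(leaves, 2)}


d = leaf_distances(EDGES)
print(d[frozenset("ab")], d[frozenset("ce")], d[frozenset("cf")])
```

Passing a `weight` dictionary keyed by the edge tuples in `EDGES` replaces the unit weights by any other edge-weighting.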




FIGURE 1. (i) A fully-resolved X-tree T for X = {a, b, c, d, e, f, g}; 
(ii) the graph $(X, \mathcal{L})$ corresponding to a strong lasso 
$\mathcal{L}$ for T (discussed further in Example 1). 



2.2. Lassos. We call a subset of X of size two a cord of X and, for 
$a, b \in X$ distinct, write ab rather than {a, b} for the cord containing a 
and b. Also, for any non-empty set $\mathcal{L} \subseteq \binom{X}{2}$ of 
cords of X, we denote the edges of the graph $(X, \mathcal{L})$, whose vertex 
set is X and whose edge set is $\{\{a, b\} : ab \in \mathcal{L}\}$, by ab 
rather than {a, b}, for $ab \in \mathcal{L}$. 

Suppose for the following that $\mathcal{L} \subseteq \binom{X}{2}$ is a 
non-empty set of cords of X. If T' = (V', E') is a further X-tree and w and 
w' are edge-weightings for T and T', respectively, such that 
$d_{(T,w)}(x, y) = d_{(T',w')}(x, y)$ holds for all $xy \in \mathcal{L}$, 
then we say that (T, w) and (T', w') are $\mathcal{L}$-isometric. Moreover we 
say that $\mathcal{L}$ is 

(i) an edge-weight lasso for T if for any two proper edge-weightings w 
and w' for T such that (T, w) and (T, w') are $\mathcal{L}$-isometric we 
have that w = w'. 

(ii) a topological lasso for T if for any other X-tree T' and any two 
proper edge-weightings w and w' for T and T', respectively, such 
that (T, w) and (T', w') are $\mathcal{L}$-isometric we have that T and T' 
are equivalent. 

(iii) a strong lasso for T if $\mathcal{L}$ is simultaneously an edge-weight 
and a topological lasso for T. 

If $\mathcal{L}$ is a strong lasso for an X-tree then the graph 
$(X, \mathcal{L})$ must be connected, and each component of this graph must 
be non-bipartite [1]. An example of a strong lasso $\mathcal{L}$ of the tree 
in Fig. 1(i) is the set of cords corresponding to the edges of the graph in 
Fig. 1(ii). 

2.3. Shellability. Given a subset $\mathcal{L}$ of $\binom{X}{2}$ with 
$X = \bigcup \mathcal{L}$, and an X-tree T, we say that 
$\binom{X}{2} - \mathcal{L}$ is T-shellable if there exists an ordering of 
the cords in $\binom{X}{2} - \mathcal{L}$ as, say, 
$a_1b_1, a_2b_2, \ldots, a_mb_m$ such that, for every 
$\mu \in \{1, 2, \ldots, m\}$, there exists a pair of 'pivots' for 
$a_\mu b_\mu$, i.e., two distinct elements 
$x_\mu, y_\mu \in X - \{a_\mu, b_\mu\}$, for which the tree $T|_{Y_\mu}$ 
obtained from T by restriction to 
$Y_\mu := \{a_\mu, b_\mu, x_\mu, y_\mu\}$ is the quartet tree 
$a_\mu x_\mu || y_\mu b_\mu$, and all cords in $\binom{Y_\mu}{2}$ except 
$a_\mu b_\mu$ are contained in 
$\mathcal{L}_\mu := \mathcal{L} \cup \{a_{\mu'} b_{\mu'} : \mu' \in \{1, 2, \ldots, \mu - 1\}\}$. 
Any such ordering of $\binom{X}{2} - \mathcal{L}$ will also be called a 
shellable ordering of $\binom{X}{2} - \mathcal{L}$, and any subset 
$\mathcal{L}$ of $\binom{X}{2}$ for which a shellable ordering of 
$\binom{X}{2} - \mathcal{L}$ exists will also be called a shellable lasso 
for T. In [4, Theorem 6], it was established that every shellable lasso for 
an X-tree is in particular a strong lasso for that tree. 

2.4. Example 1. Consider the seven-taxon tree shown in Fig. 1(i), and 
the lasso $\mathcal{L} = \{ab, bd, ad, bc, bf, ag, dg, eb, ef, fg, gc\}$ (the 
edges of the graph in Fig. 1(ii)). The remaining ten cords in 
$\binom{X}{2} - \mathcal{L}$ have a shellable ordering, described as follows: 

bg, cd, ac, cf, ce, af, df, ae, eg, ed, 

where the corresponding cord pivots are: 

(a, d), (b, g), (b, d), (b, g), (b, f), (b, g), (b, g), (b, f), (a, f), (b, f), 

and so $\mathcal{L}$ is a shellable (and hence strong) lasso for T. □ 
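
The shelling in Example 1 can be checked mechanically. The sketch below is only an illustration under stated assumptions: the tree is taken to be the caterpillar with cherries a, b and f, g (a topology consistent with the two cherries and the distances quoted for Fig. 1(i), but not read off the figure), and the eleven lasso cords are entered so that, together with the ten cords of the ordering, they exhaust all 21 pairs (in particular bc is taken to lie in the lasso). The helper names are hypothetical. A pivot pair is accepted when the restricted quartet separates the two ends of the cord, which is what the definition requires up to relabelling the pivots.

```python
from itertools import combinations

# Assumed caterpillar topology for the seven-leaf tree (unit edge weights).
EDGES = [
    ("a", "v1"), ("b", "v1"), ("v1", "v2"), ("c", "v2"), ("v2", "v3"),
    ("d", "v3"), ("v3", "v4"), ("e", "v4"), ("v4", "v5"), ("f", "v5"),
    ("g", "v5"),
]


def leaf_distances(edges):
    """Unit-weight path distances between all pairs of leaves."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    leaves = [v for v in adj if len(adj[v]) == 1]
    out = {}
    for src in leaves:
        dist, stack = {src: 0}, [src]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    stack.append(v)
        for t in leaves:
            if t != src:
                out[frozenset((src, t))] = dist[t]
    return out


def separates(d, p, q, x, y):
    """True if T restricted to {p, q, x, y} puts p and q on opposite sides."""
    D = lambda s, t: d[frozenset((s, t))]
    return min(D(p, x) + D(q, y), D(p, y) + D(q, x)) < D(p, q) + D(x, y)


def check_shelling(d, lasso, ordering, pivots):
    """Check each cord of the ordering against its pivots, in sequence."""
    known = {frozenset(c) for c in lasso}
    for cord, (x, y) in zip(ordering, pivots):
        p, q = cord
        target = frozenset(cord)
        others = {frozenset(c) for c in combinations((p, q, x, y), 2)} - {target}
        if target in known or not others <= known or not separates(d, p, q, x, y):
            return False
        known.add(target)
    return len(known) == len(d)  # every pair of leaves is now covered


LASSO = "ab bd ad bc bf ag dg eb ef fg gc".split()
ORDER = "bg cd ac cf ce af df ae eg ed".split()
PIVOTS = ["ad", "bg", "bd", "bg", "bf", "bg", "bg", "bf", "af", "bf"]
print(check_shelling(leaf_distances(EDGES), LASSO, ORDER, PIVOTS))
```

Running the same check with the ordering reversed fails at the first cord, since ed then needs df before df has been added.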

2.5. Covers, triplet covers. A necessary condition for 
$\mathcal{L} \subseteq \binom{X}{2}$ to be an edge-weight lasso or a 
topological lasso for a fully-resolved X-tree T is that $\mathcal{L}$ forms a 
cover for T, that is, for each interior vertex v of T, $\mathcal{L}$ contains 
a pair of leaves from each pair of the three components of T - v. However, 
this condition is not sufficient for $\mathcal{L}$ to be either an 
edge-weight lasso or a topological lasso (examples are given in [1]). 

A particular type of cover for a fully-resolved X-tree T is a triplet cover, 
which is defined as any subset $\mathcal{L}$ of $\binom{X}{2}$ with the 
property that for each interior vertex v of T we can select leaves a, b, c, 
one from each of the three components of T - v, so that 
$ab, ac, bc \in \mathcal{L}$. It can be shown that if $\mathcal{L}$ is a 
triplet cover for a fully-resolved X-tree T then $\mathcal{L}$ is an 
edge-weight lasso. However, it is not known whether or not every triplet 
cover of every such T is also a topological (and thereby a strong) lasso for T. 
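
The difference between a cover and a triplet cover is easy to test by brute force on a small tree. The following Python sketch is illustrative only: it uses a hypothetical six-leaf tree with three cherries aa', bb', cc' joined at a central vertex `u`, and hypothetical helper names. It checks both conditions at every interior vertex.

```python
from itertools import product

# Hypothetical six-leaf tree: three cherries attached to a central vertex u.
EDGES = [("a", "va"), ("a'", "va"), ("b", "vb"), ("b'", "vb"),
         ("c", "vc"), ("c'", "vc"), ("va", "u"), ("vb", "u"), ("vc", "u")]


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def components_at(adj, v):
    """Leaf sets of the components of T - v, one per neighbour of v."""
    parts = []
    for start in adj[v]:
        seen, stack = {v, start}, [start]
        while stack:
            u = stack.pop()
            for t in adj[u] - seen:
                seen.add(t)
                stack.append(t)
        parts.append({t for t in seen if len(adj[t]) == 1})
    return parts


def is_cover(adj, cords):
    cords = {frozenset(c) for c in cords}
    for v in adj:
        if len(adj[v]) == 1:
            continue  # leaves are skipped; only interior vertices matter
        A, B, C = components_at(adj, v)
        for P, Q in ((A, B), (A, C), (B, C)):
            if not any(frozenset(pq) in cords for pq in product(P, Q)):
                return False
    return True


def is_triplet_cover(adj, cords):
    cords = {frozenset(c) for c in cords}
    for v in adj:
        if len(adj[v]) == 1:
            continue
        A, B, C = components_at(adj, v)
        # Need one leaf per component with all three cords present.
        if not any({frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= cords
                   for a, b, c in product(A, B, C)):
            return False
    return True


ADJ = adjacency(EDGES)
CORDS = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "a'"), ("a'", "b"),
         ("b", "b'"), ("b'", "c"), ("c", "c'"), ("c'", "a")]
WITHOUT_BC = [c for c in CORDS if c != ("b", "c")]
print(is_cover(ADJ, CORDS), is_triplet_cover(ADJ, CORDS))
print(is_cover(ADJ, WITHOUT_BC), is_triplet_cover(ADJ, WITHOUT_BC))
```

In this toy instance, dropping the single cord bc leaves a cover that is no longer a triplet cover, which shows that the second condition is strictly stronger.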
3. A special class of triplet covers 

Suppose that T = (V, E) is a fully-resolved X-tree, and let 

$$\mathrm{clus}(T) := \bigcup_{e \in E} \{A_e, X - A_e\},$$

where $A_e | (X - A_e)$ denotes the X-split associated with edge $e \in E$. 
We call the elements in clus(T) 'clusters' (in biology, they are also 
sometimes referred to as 'clans' [11]). Thus a cluster is a subset of X that 
corresponds to the leaf labels on one side of some edge of T. 

Given a collection $\mathcal{C}$ of non-empty subsets of X we say that any 
function $f : \mathcal{C} \to X$ is a stable transversal for $\mathcal{C}$ if 
it satisfies the two properties: 

• (transversality) $f(A) \in A$, for all $A \in \mathcal{C}$; 

• (stability) $f(A) \in B \subseteq A \implies f(A) = f(B)$, for all $A, B \in \mathcal{C}$. 

Mostly we will be concerned with stable transversals for clus(T), which 
were introduced in [2], though for a different purpose. 

3.1. Example 2. An example of a stable transversal for clus(T) is as 
follows: Consider any stable transversal g for $2^X$ (equivalently, the 
function $g(A) = \min A$ under some total ordering of X), and consider any 
proper edge-weighting w of T. For a cluster $A \in \mathrm{clus}(T)$, 
consider the subset $A_w$ of leaves of T in A that are closest to the edge e 
whose deletion induces the split $A | X - A$. Here 'closest' refers to the 
path distance in T from each leaf in A to e under the edge-weighting w. If we 
let $f(A) = g(A_w)$ for each $A \in \mathrm{clus}(T)$, then f is a stable 
transversal for clus(T). Notice that this holds also for the corresponding 
function in which 'closest' is replaced by 'furthest' throughout. □ 

3.2. Example 3. Consider the fully-resolved X-tree shown in Fig. 2(i), 
and the function f defined as follows: f({x}) = x for all x in X, and 

f({a,a'}) = a,f({b,b'}) = b,f({c,c'}) = c 

and 

f(X - {a, a'}) = b, f(X - {b, b'}) = c, f(X - {c, c'}) = a. 

Then f is a stable transversal for T. Note that the choices of b, c, a in the 
last line could be replaced by, for example, c, a, b or c, c, a and we would still 
have a stable transversal. □ 
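
The two defining properties can be checked directly for the function in Example 3. The Python sketch below is a minimal illustration: it tests transversality and stability only on the collection on which f is specified in the example (singletons, cherries, and complements of cherries), and the names `example3` and `is_stable_transversal` are hypothetical.

```python
X = frozenset(["a", "a'", "b", "b'", "c", "c'"])


def fs(*items):
    return frozenset(items)


def example3(choice=("b", "c", "a")):
    """Build f as in Example 3; `choice` gives the values of f on the
    complements of the cherries aa', bb', cc' (in that order)."""
    f = {fs(x): x for x in X}
    f.update({fs("a", "a'"): "a", fs("b", "b'"): "b", fs("c", "c'"): "c"})
    cherries = (fs("a", "a'"), fs("b", "b'"), fs("c", "c'"))
    for cherry, value in zip(cherries, choice):
        f[X - cherry] = value
    return f


def is_stable_transversal(f):
    """Check both properties over the sets on which f is defined."""
    transversal = all(f[A] in A for A in f)
    # Stability: whenever f(A) lies in a subset B of A, f(B) must agree.
    stable = all(f[A] == f[B] for A in f for B in f if B <= A and f[A] in B)
    return transversal and stable


print(is_stable_transversal(example3()))
print(is_stable_transversal(example3(("c", "a", "b"))))
print(is_stable_transversal(example3(("c", "c", "a"))))
print(is_stable_transversal(example3(("b'", "c", "a"))))
```

The last call uses a choice that is still a transversal but violates stability, because b' lies in the cherry bb' on which f takes the value b.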

4. Stable triplet covers are minimal strong lassos for T 

Given a fully-resolved X-tree T, a stable transversal f of clus(T) defines 
a triplet cover for T as follows: For each interior vertex v of T, consider the 
three components of the graph T - v, and let $A_1^v, A_2^v, A_3^v$ denote 
their leaf sets. Then let 

$$\mathcal{L}_{(T,f)} := \bigcup_{v \in V_{int}} \{f(A_1^v)f(A_2^v),\ f(A_1^v)f(A_3^v),\ f(A_2^v)f(A_3^v)\},$$

where $V_{int}$ denotes the set of interior vertices of T. We say that 
$\mathcal{L}$ is a stable triplet cover (generated by f) if 
$\mathcal{L} = \mathcal{L}_{(T,f)}$ for some stable transversal f of 
clus(T). For example, for the pair (T, f) described in Example 3, we have: 

$$\mathcal{L}_{(T,f)} = \{ab, ac, bc, aa', a'b, bb', b'c, cc', c'a\},$$

and the graph $(X, \mathcal{L})$ for $\mathcal{L} = \mathcal{L}_{(T,f)}$ is 
shown in Fig. 2(ii). Notice that not all triplet covers are stable; indeed 
the set of triplet covers of a fully-resolved X-tree T is precisely the set 
of subsets of $\binom{X}{2}$ of the form $\mathcal{L}_{(T,f)}$ for some 
$f : \mathrm{clus}(T) \to X$ that is required to satisfy only the 
transversality property above. 
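
The construction of the cover from a transversal can be sketched in a few lines of Python. The sketch assumes the tree of Example 3 is the six-leaf tree with cherries aa', bb', cc' attached to a central vertex (the only fully-resolved tree whose clusters match those listed in the example); the vertex labels `va`, `vb`, `vc`, `u` and the function names are hypothetical.

```python
# Assumed tree for Example 3: three cherries joined at a central vertex u.
EDGES = [("a", "va"), ("a'", "va"), ("b", "vb"), ("b'", "vb"),
         ("c", "vc"), ("c'", "vc"), ("va", "u"), ("vb", "u"), ("vc", "u")]
X = frozenset(["a", "a'", "b", "b'", "c", "c'"])


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def components_at(adj, v):
    """Leaf sets of the three components of T - v."""
    parts = []
    for start in adj[v]:
        seen, stack = {v, start}, [start]
        while stack:
            u = stack.pop()
            for t in adj[u] - seen:
                seen.add(t)
                stack.append(t)
        parts.append(frozenset(t for t in seen if len(adj[t]) == 1))
    return parts


def cover_from_transversal(adj, f):
    """Union, over interior vertices v, of the three cords among
    f(A_1^v), f(A_2^v), f(A_3^v)."""
    cords = set()
    for v in adj:
        if len(adj[v]) == 1:
            continue  # skip leaves
        p, q, r = (f[part] for part in components_at(adj, v))
        cords |= {frozenset((p, q)), frozenset((p, r)), frozenset((q, r))}
    return cords


# The transversal of Example 3, on the clusters that the construction uses.
f = {frozenset([x]): x for x in X}
f.update({frozenset(["a", "a'"]): "a", frozenset(["b", "b'"]): "b",
          frozenset(["c", "c'"]): "c",
          X - {"a", "a'"}: "b", X - {"b", "b'"}: "c", X - {"c", "c'"}: "a"})

cover = cover_from_transversal(adjacency(EDGES), f)
print(sorted("".join(sorted(c)) for c in cover))
```

Only the leaf sets of components of T - v for interior v are looked up, so f needs no value on the complements of single leaves for this computation.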



4.1. 2d-trees. Interestingly, Fig. 2(ii) shows that for the set 
$\mathcal{L} = \mathcal{L}_{(T,f)}$ with T and f from Example 3, the graph 
$(X, \mathcal{L})$ is a 2d-tree, where a graph G = (V, E) is called a 2d-tree 
if there exists an ordering $x_1, x_2, \ldots, x_n$ of V such that 
$\{x_1, x_2\} \in E$ and, for $i = 3, \ldots, n$, the vertex $x_i$ has 
degree 2 in the subgraph of G induced by $\{x_1, x_2, \ldots, x_i\}$. 
2d-trees are examples of kd-trees, which were characterized in [10] and also 
studied in e.g. [7]. In Fig. 2(ii) an acceptable vertex ordering is 
a, b, c, a', b', c'. The graph in Fig. 1(ii) is also a 2d-tree, as can be 
seen by considering the vertex ordering a, b, d, g, c, f, e. 
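
Whether a given vertex ordering witnesses the 2d-tree property is a direct check of the definition. The following Python sketch (with the hypothetical function name `is_2d_tree_ordering`) applies it to the two orderings just mentioned; the edge set used for Fig. 1(ii) is the eleven-cord lasso of Example 1, entered here with bc among its cords.

```python
def is_2d_tree_ordering(cords, order):
    """Check the definition: the first two vertices are adjacent, and each
    later vertex has exactly two neighbours among its predecessors."""
    edges = {frozenset(c) for c in cords}
    vertices = {v for e in edges for v in e}
    if len(order) < 2 or len(set(order)) != len(order) or set(order) != vertices:
        return False
    if frozenset(order[:2]) not in edges:
        return False
    for i in range(2, len(order)):
        degree = sum(frozenset((order[i], u)) in edges for u in order[:i])
        if degree != 2:
            return False
    return True


FIG1 = "ab bd ad bc bf ag dg eb ef fg gc".split()
FIG2 = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "a'"), ("a'", "b"),
        ("b", "b'"), ("b'", "c"), ("c", "c'"), ("c'", "a")]

print(is_2d_tree_ordering(FIG1, list("abdgcfe")))
print(is_2d_tree_ordering(FIG2, ["a", "b", "c", "a'", "b'", "c'"]))
print(is_2d_tree_ordering(FIG1, list("abcdefg")))
```

The third call fails because c has only one neighbour (b) among its predecessors a, b in the alphabetical ordering, so an ordering must be chosen with some care even when the graph is a 2d-tree.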

4.2. Main result. We can now state our first main result which relates 
stable triplet covers with 2d-trees and shellable lassos. 

Theorem 4.1. If $\mathcal{L}$ is a stable triplet cover of a fully-resolved 
X-tree T with $n := |X| \geq 3$, then 

(i) $(X, \mathcal{L})$ is a 2d-tree. 

(ii) $\mathcal{L}$ is a shellable lasso for T, and so $\mathcal{L}$ is a 
strong lasso for T. 

(iii) $|\mathcal{L}| = 2n - 3$, and so $\mathcal{L}$ is also a minimal 
strong lasso for T. 
Proof. We prove parts (i)-(iii) simultaneously by induction on n = |X|. 
Shellability holds trivially for n = 3 (since then 
$\binom{X}{2} - \mathcal{L} = \emptyset$), so suppose that it holds when 
$n = k \geq 3$, and that T is a fully-resolved tree with k + 1 leaves, and 
that $\mathcal{L}$ is a triplet cover for T generated by a stable transversal 
f of clus(T). Select any cherry x, y of T. Without loss of generality, we may 
suppose that f({x, y}) = x. Let 

$$z := f(X - \{x, y\}), \quad X' := X - \{y\}, \quad T' := T|_{X'}, \quad \mathcal{L}' := \mathcal{L} - \{xy, yz\},$$

and define $f' : \mathrm{clus}(T') \to X$ by setting 

$$f'(A) = \begin{cases} f(A), & \text{if } x \notin A; \\ f(A \cup \{y\}), & \text{if } x \in A. \end{cases}$$

Note that, since f is a stable transversal for clus(T), it follows that y is 
not an element of any cord of $\mathcal{L}'$, and so 
$\mathcal{L}' \subseteq \binom{X'}{2}$. Moreover, $y \neq f'(A)$ for any 
$A \in \mathrm{clus}(T')$, and so $f' : \mathrm{clus}(T') \to X'$. It can 
now be checked that f' is a stable transversal for clus(T') and so 
$\mathcal{L}'$ is a stable triplet cover of T', generated by f'. By the 
inductive hypothesis (applied to T' and $\mathcal{L}'$) it follows with 
regards to (i) that $(X', \mathcal{L}')$ is a 2d-tree. Clearly adding y to 
the vertex set of that graph and xy and zy to its edge set preserves the 
2d-tree property. By the definition of $\mathcal{L}'$ it is easy to see that 
the resulting graph is $(X, \mathcal{L})$. 

Note that regarding (ii) and (iii) the induction hypothesis implies that 
$|\mathcal{L}'| = 2k - 3$, and so $|\mathcal{L}| = 2(k + 1) - 3$, and that 
$\binom{X'}{2} - \mathcal{L}'$ is shellable. So let us fix an ordering of 
$\binom{X'}{2} - \mathcal{L}'$ that provides such a shelling. This will 
form the initial segment of a shellable ordering of 
$\binom{X}{2} - \mathcal{L}$. 

To describe this extended ordering, let v be the interior vertex of T 
adjacent to leaves x and y, and let u be the interior vertex of T adjacent to 
v. Consider the three components of the graph T - u. One component contains 
x, y, and we will denote the leaf sets of the other two components by $X_2$ 
and $X_3$, where, without loss of generality, $z \in X_3$. Notice that 
$\binom{X}{2} - \mathcal{L}$ is the disjoint union of the three sets: 

$$\binom{X'}{2} - \mathcal{L}', \quad \{ty : t \in X_2\} \quad \text{and} \quad \{ty : t \in X_3 - \{z\}\}.$$

We order $\binom{X}{2} - \mathcal{L}$ as follows: the elements of 
$\binom{X'}{2} - \mathcal{L}'$ come first, ordered by their shellable 
ordering, followed by the elements ty with $t \in X_2$ (in any order), 
followed by the elements ty with $t \in X_3 - \{z\}$ (in any order). 

We claim that any such ordering provides a shellable ordering of 
$\binom{X}{2} - \mathcal{L}$. To see this, observe first that, for any leaf 
$t \in X_2$, the elements x, z provide 'pivots' for the pair t, y, since 
$T|_{\{x, y, z, t\}} = xy||zt$ and all cords in $\binom{\{x, y, z, t\}}{2}$ 
except ty are contained in 
$\mathcal{L} \cup (\binom{X'}{2} - \mathcal{L}')$. Also, for any leaf 
$t \in X_3$, if we select any leaf $z' \in X_2$ then the pair x, z' provides 
a 'pivot' for t, y, since $T|_{\{x, y, z', t\}} = xy||z't$, and all cords in 
$\binom{\{x, y, z', t\}}{2}$ except ty are contained in 
$\mathcal{L} \cup (\binom{X'}{2} - \mathcal{L}') \cup \{t'y : t' \in X_2\}$. 
In all cases, the cords required for pivoting come earlier in the ordering. 

Thus, we have established that $\mathcal{L}$ is a shellable lasso for T, and 
so, by Theorem 6 of [4], $\mathcal{L}$ is also a strong lasso for T. 
Moreover, we showed that $|\mathcal{L}| = 2|X| - 3$, and since this equals 
the number of edges in any fully-resolved X-tree, linear algebra ensures that 
no strict subset of $\mathcal{L}$ could be an edge-weight lasso for T. Hence, 
$\mathcal{L}$ is a minimal strong lasso for T, which completes the proof of 
the induction step, and thereby of the theorem. □ 



4.3. Remarks. 

(1) Just because a graph $(X, \mathcal{L})$ is a 2d-tree, it does not follow 
that $\mathcal{L}$ forms a strong (let alone a shellable) lasso for every 
given fully-resolved X-tree T. A simple example is furnished by 
X = {a, b, c, d} and $\mathcal{L} = \{ab, ac, bc, ad, bd\}$, for which 
$(X, \mathcal{L})$ is a 2d-tree, and yet $\mathcal{L}$ fails to be a strong 
lasso for T = ab||cd. 

However, if $(X, \mathcal{L})$ forms a 2d-tree, or more generally if 
$\mathcal{L}$ contains a subset $\mathcal{L}'$ such that 
$(X, \mathcal{L}')$ is a 2d-tree, then $\mathcal{L}$ is a strong lasso for 
at least one fully-resolved X-tree. The proof is constructive, based on the 
ordering $x_1, x_2, \ldots, x_n$ in the definition of a 2d-tree: Start with 
the tree consisting of leaves $x_1$ and $x_2$, and construct a 
fully-resolved tree as follows: for each i > 2, if $x_i$ is adjacent to 
$x_j$ and $x_k$ in $(X, \mathcal{L}')$ (where j, k < i) then let $x_i$ be 
the leaf that is attached by a new edge to a new subdivision vertex on the 
path connecting $x_j$ and $x_k$ in the tree so-far constructed. 

It may be of interest to explore further the connection between 
shellability and 2d-trees, and in particular, the question of when the 
shellability of $\binom{X}{2} - \mathcal{L}$ entails the latter property for 
$\mathcal{L}$, or for some subset of $\mathcal{L}$. 

(2) Suppose that T is a fully-resolved X-tree, and 
$\mathcal{L} \subseteq \binom{X}{2}$ contains a stable triplet cover. A 
natural setting in which this situation arises is the following. Suppose 
(T, w) is a properly edge-weighted fully-resolved X-tree, and 
$\mathcal{L} \subseteq \binom{X}{2}$ has the property that, for any interior 
vertex v, $\mathcal{L}$ contains every cord xy for which x is a closest leaf 
to v in one subtree of T - v and y is a closest leaf to v in another subtree 
of T - v. Then, as noted in Example 2 above, $\mathcal{L}$ contains a stable 
triplet cover. 

Now, when $\mathcal{L}$ contains a stable triplet cover for T, it follows by 
Theorem 4.1 that $\mathcal{L}$ is a shellable, and thereby also a strong, 
lasso for T (since any superset of a strong lasso for a tree is also a strong 
lasso for that tree). However, it is perhaps not clear how one might 
efficiently construct (T, w) from the distances induced by $\mathcal{L}$, 
particularly when the subset of $\mathcal{L}$ corresponding to the stable 
triplet cover is not also given explicitly. Thus, in the next section we 
describe a polynomial-time algorithm for reconstructing (T, w) whenever 
$\mathcal{L}$ contains some (unknown) shellable lasso for T. 

5. An algorithm for reconstructing $(T, w)$ from $d_{(T,w)}|_{\mathcal{L}}$ 
when $\mathcal{L}$ contains a shellable lasso for T 

Suppose that $\mathcal{L} \subseteq \binom{X}{2}$ and that T is a 
fully-resolved X-tree, w is a proper edge-weighting of T and 
$d = d_{(T,w)}$. Starting with $\mathcal{L}^* = \mathcal{L}$, add cords to 
$\mathcal{L}^*$ and extend the domain of d to those cords, by repeated 
application of the following extension rule ($\mathcal{R}$), described in 
[7] (Section 6.2, page 246): 

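($\mathcal{R}$) Whenever $x, y, z, u \in X$ and 

$$\binom{\{x, y, u, z\}}{2} - \{xz\} \subseteq \mathcal{L}^*, \quad xz \notin \mathcal{L}^*, \quad \text{and} \quad d(x, y) + d(u, z) < d(x, u) + d(y, z),$$

add xz to $\mathcal{L}^*$, and let $d(x, z) := d(x, u) + d(y, z) - d(y, u)$. 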

Let $\mathrm{cl}_{\mathcal{R}}(\mathcal{L})$ be the resulting set of cords 
obtained from the initial set $\mathcal{L}$ when this extension rule no 
longer yields any new cords. 

Note that $\mathrm{cl}_{\mathcal{R}}(\mathcal{L})$ can be computed in 
polynomial time, and that d-values are assigned for all cords in 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L})$. Moreover, if 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L}) = \binom{X}{2}$, then 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L})$ is a strong lasso for T; however, 
the converse does not hold (Example 6.2 of [4] provides a counterexample). 
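
A direct, unoptimized implementation of the extension rule may help make the closure concrete. The Python sketch below is an illustration rather than the procedure of [7]: the function name `closure` is hypothetical, and the input is a hypothetical instance, namely the nine cords of the stable triplet cover from Example 3 with the distances they receive when the six-leaf tree with cherries aa', bb', cc' has unit edge weights (2 within a cherry, 4 otherwise).

```python
from itertools import combinations, permutations


def closure(leaves, d):
    """Apply the extension rule until no new cord can be added.

    `d` maps known cords (pairs of leaves) to distances; the result maps
    every cord of the closure to its known or inferred distance."""
    d = {frozenset(k): v for k, v in d.items()}
    D = lambda p, q: d[frozenset((p, q))]
    changed = True
    while changed:
        changed = False
        for quad in combinations(leaves, 4):
            for x, y, u, z in permutations(quad):
                xz = frozenset((x, z))
                if xz in d:
                    continue
                rest = (frozenset(c) for c in combinations(quad, 2))
                if any(c != xz and c not in d for c in rest):
                    continue  # need the other five cords of the quadruple
                # Strict inequality identifies the quartet xy||uz.
                if D(x, y) + D(u, z) < D(x, u) + D(y, z):
                    d[xz] = D(x, u) + D(y, z) - D(y, u)
                    changed = True
    return d


LEAVES = ["a", "a'", "b", "b'", "c", "c'"]
KNOWN = {("a", "a'"): 2, ("b", "b'"): 2, ("c", "c'"): 2,
         ("a", "b"): 4, ("a", "c"): 4, ("b", "c"): 4,
         ("a'", "b"): 4, ("b'", "c"): 4, ("c'", "a"): 4}

full = closure(LEAVES, KNOWN)
print(len(full), full[frozenset(("a'", "c"))], full[frozenset(("b'", "c'"))])
```

Each pass examines all quadruples of leaves, and at most $\binom{|X|}{2}$ passes can add a cord, so this naive loop already runs in time polynomial in |X|; a more careful implementation would only revisit quadruples that contain a newly added cord.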

Theorem. If $\mathcal{L} \subseteq \binom{X}{2}$ contains a shellable lasso 
for a fully-resolved X-tree T, and $d = d_{(T,w)}$ for some proper 
edge-weighting w, then $\mathrm{cl}_{\mathcal{R}}(\mathcal{L}) = \binom{X}{2}$. 
Consequently, T and w can be reconstructed in polynomial time from 
the restriction of d to $\mathcal{L}$. 

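Proof. Suppose that $\mathcal{L}' \subseteq \mathcal{L}$ is a shellable 
lasso for T; we will show that 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L}') = \binom{X}{2}$, and so 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L}) = \binom{X}{2}$. Suppose to the 
contrary that $\mathrm{cl}_{\mathcal{R}}(\mathcal{L}')$ is a strict subset 
of $\binom{X}{2}$, and consider any shelling $a_1b_1, \ldots, a_mb_m$ of the 
cords in $\binom{X}{2} - \mathcal{L}'$ (such a shelling exists by the 
assumption that $\mathcal{L}'$ is a shellable lasso for T). Let 
$j \in \{1, \ldots, m\}$ be the smallest index for which 
$a_jb_j \notin \mathrm{cl}_{\mathcal{R}}(\mathcal{L}')$. Then the condition 
on the shelling ensures that there exist pivots 
$x_j, y_j \in X - \{a_j, b_j\}$ so that for $Y = \{a_j, b_j, x_j, y_j\}$ we 
have that $T|_Y$ is the quartet tree $a_jx_j || b_jy_j$ and that each cord 
in $\binom{Y}{2} - \{a_jb_j\}$ either is an element of $\mathcal{L}'$ or 
occurs earlier in the ordering for the shelling than $a_jb_j$, and so, by 
the minimality assumption concerning j, all these cords lie in 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L}')$. Since w is proper, the interior 
edge of this quartet tree has positive length, so the strict inequality 
required by rule ($\mathcal{R}$) holds. Consequently, 
$a_jb_j \in \mathrm{cl}_{\mathcal{R}}(\mathrm{cl}_{\mathcal{R}}(\mathcal{L}')) = \mathrm{cl}_{\mathcal{R}}(\mathcal{L}')$, 
a contradiction. Thus, our assumption that 
$\mathrm{cl}_{\mathcal{R}}(\mathcal{L}')$ is a strict subset of 
$\binom{X}{2}$ is not possible, as required. 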

Finally, to efficiently recover (T, w), once d has been defined on all of 
$\binom{X}{2}$, one can apply standard distance-based reconstruction methods 
for fully-resolved trees, such as the Neighbor-Joining method [6]. □ 